AIbase

# Text Visual Question Answering

Git Base Textvqa
MIT
A visual question answering model based on microsoft/git-base-textvqa and fine-tuned on the TextVQA dataset. It is intended for answering questions about text that appears in images.
Large Language Model Transformers Other
Hellraiser24
19
0
Git Large Textvqa
MIT
GIT (GenerativeImage2Text) is a vision-language model built on a Transformer decoder that is conditioned on both CLIP image tokens and text tokens. This checkpoint is fine-tuned for the TextVQA task.
Image-to-Text Transformers Supports Multiple Languages
microsoft
62
4
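
The conditioning on image tokens and text tokens described for GIT can be pictured as an attention mask over a sequence of image tokens followed by text tokens. The sketch below is illustrative only and is not taken from the model's implementation. It assumes the layout described in the GIT paper: image tokens attend to one another, and each text token attends to all image tokens plus itself and earlier text tokens. The function name `git_attention_mask` and the token counts are hypothetical.

```python
def git_attention_mask(num_image_tokens, num_text_tokens):
    """Return mask[i][j] == True when position i may attend to position j.

    Positions 0..num_image_tokens-1 are image tokens; the rest are text
    tokens (for VQA, the question followed by the generated answer).
    """
    total = num_image_tokens + num_text_tokens
    mask = [[False] * total for _ in range(total)]
    for i in range(total):
        for j in range(total):
            if j < num_image_tokens:
                # Every position (image or text) may attend to every image token.
                mask[i][j] = True
            elif i >= num_image_tokens and j <= i:
                # Text tokens attend causally: to themselves and earlier text only.
                mask[i][j] = True
    return mask


if __name__ == "__main__":
    # Toy sizes: 2 image tokens followed by 3 text tokens.
    for row in git_attention_mask(2, 3):
        print("".join("x" if allowed else "." for allowed in row))
```

In the printed grid, each row is a query position and each `x` marks a position it may attend to. Image rows see only the image columns, while text rows see all image columns and a growing causal prefix of the text columns.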
© 2025 AIbase